Module 1

Introduction to Databases

Databases and database technology play a major role in modern computing. Almost every field that uses computers depends on databases to store, manage, and retrieve data efficiently.

Databases are used in:

Without databases, managing large volumes of data would be slow, error-prone, and inefficient.

Database

Definition

A database is a collection of related data. By data, we mean known facts that can be recorded and have an implicit meaning.

Example of data:

These facts become useful information when they are stored in an organized manner.

(Figure: Simplified Database System)

Implicit Properties of a Database

A database has several important implicit properties:

1. Representation of the Real World

A database represents some aspect of the real world, known as the miniworld or the universe of discourse (UoD).

Any change in the real world (miniworld) is reflected in the database.
Example: If a student changes address, the database record is updated.

2. Logical Coherence

A database is a logically coherent collection of data. Random or unrelated data cannot form a database.

3. Designed for a Specific Purpose

Every database is designed, built, and populated with data for a specific group of users and applications.

Database Summary

A database can be summarized as:

Databases evolve over time as the real world and requirements change.

Size and Complexity of a Database

Databases can vary greatly in size and complexity.

Small Databases

Large Databases

For example, Amazon's database stores data for millions of products, such as books, electronics, and clothing.

Manual vs Computerized Database

Manual Database

Computerized Database

Large and complex databases are, in practice, created and maintained with a Database Management System.

Database Management System (DBMS)

Definition

A Database Management System (DBMS) is a collection of programs that enables users to create, maintain, and access a database.

The DBMS acts as an interface between the users (or application programs) and the stored database.

Functions of DBMS

1. Defining the Database

DBMS allows defining:

This descriptive information is called metadata, and the DBMS stores it in the database catalog (also called the data dictionary).

2. Constructing the Database

Data is stored on a physical storage device controlled by the DBMS.

3. Manipulating the Database

4. Sharing the Database

Multiple users and applications can access the database simultaneously.

Additional DBMS Functions

Protection

Maintenance

Databases are long-term systems. DBMS supports:

The database and the DBMS software together form a database system.

Example: UNIVERSITY Database

Consider a UNIVERSITY database used to store information about students, courses, and grades.

Files in the Database

  1. STUDENT - stores student details
  2. COURSE - stores course information
  3. SECTION - stores course section details
  4. GRADE_REPORT - stores student grades
  5. PREREQUISITE - stores course prerequisites

Each file contains records of the same type and is logically related to others.

COURSE

Course_name                 Course_number   Credit_hours   Department
Intro to Computer Science   CS1310          4              CS
Data Structures             CS3320          4              CS
Discrete Mathematics        MATH2410        3              MATH
Database                    BCS403          3              CS

SECTION

Section_identifier   Course_number   Semester   Year   Instructor
92                   BCS403          Fall       04     Pruthviraj
85                   MATH2410        Fall       04     Rajshekhar Hammigi
102                  CS3320          Spring     05     G C DIVYA
112                  MATH2410        Fall       05     Rajshekhar Hammigi
119                  CS1310          Fall       05     SAMEER B
135                  CS3380          Fall       05     MANJUNATH K G

GRADE_REPORT

Student_number   Section_identifier   Grade
17               112                  B
17               119                  C
8                85                   A
8                92                   A
8                102                  B
8                135                  A

PREREQUISITE

Course_number   Prerequisite_number
CS3380          CS3320
CS3380          MATH2410
CS3320          CS1310

Defining a UNIVERSITY Database

Defining a database means specifying the structure of records stored in each file and the data elements contained in those records.

Record Structure Specification

Each file in the UNIVERSITY database contains records of a specific type. The attributes (fields) of each record must be clearly defined.

Data Type Specification

Each data element must have an appropriate data type.
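
As a minimal sketch of what "defining" can look like in practice, the following Python snippet declares four of the UNIVERSITY files as tables. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in DBMS. The column names follow the tables shown earlier, but the data types and constraints are assumptions chosen for illustration.

```python
import sqlite3

# In-memory SQLite database; SQLite stands in for "the DBMS" in this sketch.
conn = sqlite3.connect(":memory:")

# Column names follow the UNIVERSITY tables; the data types are assumptions.
conn.executescript("""
CREATE TABLE course (
    course_name   TEXT    NOT NULL,
    course_number TEXT    PRIMARY KEY,
    credit_hours  INTEGER NOT NULL,
    department    TEXT    NOT NULL
);
CREATE TABLE section (
    section_identifier INTEGER PRIMARY KEY,
    course_number      TEXT    NOT NULL,
    semester           TEXT    NOT NULL,
    year               TEXT    NOT NULL,
    instructor         TEXT
);
CREATE TABLE grade_report (
    student_number     INTEGER NOT NULL,
    section_identifier INTEGER NOT NULL,
    grade              TEXT    NOT NULL
);
CREATE TABLE prerequisite (
    course_number       TEXT NOT NULL,
    prerequisite_number TEXT NOT NULL
);
""")

# List the record types (tables) that are now defined.
tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
print(tables)
```

No data has been stored at this point; only the structure of each record type exists.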

Constructing the UNIVERSITY Database

Constructing the database involves storing actual data as records in the appropriate files defined earlier.

Data Storage

Relationships Between Records

Records in different files are logically related.

These relationships allow meaningful retrieval of information from the database.
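
The sketch below illustrates construction and one relationship, again assuming SQLite via Python's sqlite3 module as the DBMS. It loads a few rows taken from the COURSE and SECTION tables and then relates each section to its course through the shared Course_number value. The column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_name TEXT, course_number TEXT PRIMARY KEY,
                     credit_hours INTEGER, department TEXT);
CREATE TABLE section (section_identifier INTEGER PRIMARY KEY,
                      course_number TEXT, semester TEXT, year TEXT,
                      instructor TEXT);
""")

# Rows taken from the COURSE and SECTION tables of the UNIVERSITY example.
conn.executemany("INSERT INTO course VALUES (?, ?, ?, ?)", [
    ("Intro to Computer Science", "CS1310", 4, "CS"),
    ("Data Structures", "CS3320", 4, "CS"),
    ("Discrete Mathematics", "MATH2410", 3, "MATH"),
])
conn.executemany("INSERT INTO section VALUES (?, ?, ?, ?, ?)", [
    (85, "MATH2410", "Fall", "04", "Rajshekhar Hammigi"),
    (102, "CS3320", "Spring", "05", "G C DIVYA"),
    (119, "CS1310", "Fall", "05", "SAMEER B"),
])

# A SECTION record is related to the COURSE record with the same course number.
related = conn.execute("""
    SELECT s.section_identifier, c.course_name
    FROM section AS s
    JOIN course AS c ON c.course_number = s.course_number
    ORDER BY s.section_identifier
""").fetchall()
print(related)
```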

Manipulating the UNIVERSITY Database

Database manipulation involves querying and updating the database.

Query Operations

Examples of database queries include:

Update Operations

Examples of updates include:

Queries and updates must be specified precisely using the DBMS query language (such as SQL).
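
As an illustration, the following sketch runs one query and one update against the GRADE_REPORT data, assuming SQLite via Python's sqlite3 module. The specific query (grades of student 8) and the specific update (changing a grade) are hypothetical examples, not operations prescribed by these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade_report (student_number INTEGER, "
             "section_identifier INTEGER, grade TEXT)")
conn.executemany("INSERT INTO grade_report VALUES (?, ?, ?)", [
    (17, 112, "B"), (17, 119, "C"), (8, 85, "A"),
    (8, 92, "A"), (8, 102, "B"), (8, 135, "A"),
])

# Query: retrieve every grade recorded for student 8.
grades_for_8 = conn.execute(
    "SELECT section_identifier, grade FROM grade_report "
    "WHERE student_number = ? ORDER BY section_identifier", (8,)).fetchall()

# Update (hypothetical): change student 17's grade in section 119 to 'B'.
with conn:
    conn.execute("UPDATE grade_report SET grade = ? "
                 "WHERE student_number = ? AND section_identifier = ?",
                 ("B", 17, 119))

updated = conn.execute(
    "SELECT grade FROM grade_report "
    "WHERE student_number = 17 AND section_identifier = 119").fetchone()[0]
print(grades_for_8, updated)
```

Both statements name exactly which records and fields they touch, which is what "specified precisely" means here.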

Database Design Process

Designing a database follows a structured process to ensure accuracy, flexibility, and efficiency.

1. Requirements Specification & Analysis

2. Conceptual Design

Represents data at a high level using models such as the ER model.

3. Logical Design

Converts conceptual design into a data model supported by a DBMS (e.g., relational model).

4. Physical Design

The database is then implemented, populated, and continuously maintained.

Database Approach vs File Processing Approach

In traditional file processing, each department maintains its own files and applications.

Problems with File Processing Approach

Database Approach

A DBMS ensures controlled access to shared data by multiple users.

Characteristics of the Database Approach

Self-Describing Nature of a Database System

A database system stores not only the data but also the description of the database structure.

This description is called metadata and is stored in the system catalog.

The system catalog is used by both users and the DBMS software to correctly interpret stored data.
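
One way to see the self-describing property is to ask the DBMS for its own catalog. The sketch below assumes SQLite via Python's sqlite3 module, where the catalog is exposed through the built-in sqlite_master table and the table_info pragma. Other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_name TEXT, course_number TEXT, "
             "credit_hours INTEGER, department TEXT)")

# SQLite keeps its catalog in the built-in sqlite_master table.
catalog_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'course'").fetchone()[0]

# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(course)")]
print(catalog_sql)
print(columns)
```

The program never hard-codes the structure of COURSE when reading it back; the structure comes from the database itself.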

Program-Data Independence and Data Abstraction

Program-Data Independence

In a DBMS, data structure definitions are stored in the catalog, separately from the application programs.

As a result, many changes to the data structure do not require changes to the application programs.
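
A small sketch of this idea, assuming SQLite via Python's sqlite3 module: the function course_titles plays the role of an application program, and the added column credit_hours is a hypothetical structural change. The function keeps working unchanged after the table gains a column.

```python
import sqlite3

def course_titles(conn):
    """Application code: it names the columns it needs and nothing else."""
    return [row[0] for row in conn.execute(
        "SELECT course_name FROM course ORDER BY course_number")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_name TEXT, course_number TEXT)")
conn.execute("INSERT INTO course VALUES ('Data Structures', 'CS3320')")
before = course_titles(conn)

# Structural change: add a column (the column name is hypothetical).
conn.execute("ALTER TABLE course ADD COLUMN credit_hours INTEGER")
after = course_titles(conn)
print(before, after)
```

Not every change is this harmless: dropping or renaming a column that the program names would still break it.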

Program-Operation Independence

Operations are defined separately from their implementation. Applications use operation names without knowing how they are implemented.

Data Abstraction

Data abstraction hides storage and implementation details from users.

A data model provides a conceptual view using objects, attributes, and relationships.

Support of Multiple Views of the Data

A database typically has many users, and each user may require a different view of the database.

A view is a subset of the database or a virtual representation derived from the database that is not explicitly stored.

Need for Multiple Views

Example

In a UNIVERSITY database, one user may be interested only in viewing and printing student transcripts.

This user's view contains only student details, courses, and grades, even though the database stores much more information.

A multiuser DBMS must provide facilities to define and manage such views.
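
The transcript example can be sketched as a view. The snippet assumes SQLite via Python's sqlite3 module; the view name transcript and the reduced table definitions are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_name TEXT, course_number TEXT);
CREATE TABLE section (section_identifier INTEGER, course_number TEXT);
CREATE TABLE grade_report (student_number INTEGER,
                           section_identifier INTEGER, grade TEXT);
INSERT INTO course VALUES ('Discrete Mathematics', 'MATH2410'),
                          ('Data Structures', 'CS3320');
INSERT INTO section VALUES (85, 'MATH2410'), (102, 'CS3320');
INSERT INTO grade_report VALUES (8, 85, 'A'), (8, 102, 'B');

-- The view stores no rows of its own; it is derived when queried.
CREATE VIEW transcript AS
SELECT g.student_number, c.course_name, g.grade
FROM grade_report AS g
JOIN section AS s ON s.section_identifier = g.section_identifier
JOIN course  AS c ON c.course_number = s.course_number;
""")

transcript_rows = conn.execute(
    "SELECT course_name, grade FROM transcript "
    "WHERE student_number = 8 ORDER BY course_name").fetchall()
print(transcript_rows)
```

A user of this view sees course names and grades without needing to know that three underlying files are involved.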

Sharing of Data and Multiuser Transaction Processing

A multiuser DBMS allows multiple users to access the database simultaneously.

This is essential when data for multiple applications is integrated and maintained in a single database.

Concurrency Control

The DBMS must include concurrency control mechanisms to ensure correct execution when multiple users update the same data.

Example

In an airline reservation system, multiple agents may try to assign the same seat.

The DBMS ensures that each seat is assigned to only one passenger.
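
The following sketch shows one simple way an application can rely on the DBMS for this guarantee, assuming SQLite via Python's sqlite3 module. The seat table, the assign_seat function, and the passenger names are hypothetical. The check ("is the seat free?") and the assignment happen in a single UPDATE statement, so the second request for the same seat finds it taken. The two requests run one after the other here; a real multiuser DBMS applies locking or a similar mechanism to get the same outcome under true concurrency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (seat_no TEXT PRIMARY KEY, passenger TEXT)")
conn.execute("INSERT INTO seat VALUES ('12A', NULL)")
conn.commit()

def assign_seat(conn, seat_no, passenger):
    """Assign the seat only if it is still free; return True on success."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "UPDATE seat SET passenger = ? "
            "WHERE seat_no = ? AND passenger IS NULL",
            (passenger, seat_no))
        return cur.rowcount == 1

first = assign_seat(conn, "12A", "Asha")    # seat is free, so this succeeds
second = assign_seat(conn, "12A", "Ravi")   # seat is taken, so this fails
holder = conn.execute(
    "SELECT passenger FROM seat WHERE seat_no = '12A'").fetchone()[0]
print(first, second, holder)
```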

OLTP Applications

Applications that involve frequent, short database transactions are called Online Transaction Processing (OLTP) applications.

A major role of a multiuser DBMS is to ensure that concurrent transactions execute correctly and efficiently.

Transaction Concept and Properties

A transaction is an executing program or process that includes one or more database operations such as reading or updating data.

Transaction Properties

Atomicity

Ensures that all operations of a transaction are executed completely or none are executed.
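
Atomicity can be sketched as follows, assuming SQLite via Python's sqlite3 module. The record_grades function and the CHECK constraint on grade are assumptions made for this example. The transaction tries to insert two grades; the second is invalid, so the first is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade_report (student_number INTEGER, "
             "section_identifier INTEGER, "
             "grade TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F')))")

def record_grades(conn, rows):
    """Insert all rows as one transaction; return True only if all succeed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            for row in rows:
                conn.execute("INSERT INTO grade_report VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# The second row violates the CHECK constraint, so neither row is kept.
ok = record_grades(conn, [(8, 85, "A"), (8, 92, "Z")])
stored = conn.execute("SELECT COUNT(*) FROM grade_report").fetchone()[0]
print(ok, stored)
```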

Isolation

Ensures that each transaction appears to execute independently of other concurrent transactions.

Even when many transactions execute simultaneously, their effects do not interfere with each other.
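
As a rough illustration of isolation, the sketch below opens two connections to the same database file, assuming SQLite via Python's sqlite3 module with its default settings. The names writer and reader are hypothetical. Until the writer commits, the reader does not see the writer's uncommitted row.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "university.db")
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)

    writer.execute("CREATE TABLE grade_report (student_number INTEGER, "
                   "section_identifier INTEGER, grade TEXT)")
    writer.commit()

    # The writer starts a transaction and inserts a row without committing.
    writer.execute("INSERT INTO grade_report VALUES (8, 85, 'A')")
    seen_before_commit = reader.execute(
        "SELECT COUNT(*) FROM grade_report").fetchall()[0][0]

    # Once the writer commits, the row becomes visible to the reader.
    writer.commit()
    seen_after_commit = reader.execute(
        "SELECT COUNT(*) FROM grade_report").fetchall()[0][0]

    writer.close()
    reader.close()

print(seen_before_commit, seen_after_commit)
```

How strictly concurrent transactions are separated depends on the DBMS and its configuration; this shows only the basic effect.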

Database Users

Database users can be broadly classified into different categories based on their interaction with the database system.

Main Categories

Actors on the Scene

1. Database Administrator (DBA)

The DBA is the chief administrator responsible for managing the database system.

2. Database Designers

Responsible for identifying data to be stored and organizing it efficiently.

3. End Users

End users access the database for querying, updating, and reporting.

Types of End Users

Casual End Users

Use the database occasionally and require different information each time.

Naive / Parametric End Users

Use predefined transactions frequently.

Sophisticated End Users

Use DBMS facilities to develop their own applications.

Stand-Alone Users

Maintain personal databases using ready-made software packages.

System Analysts and Application Programmers

System Analysts

Analyze user requirements and design specifications for database applications.

Application Programmers

Implement, test, document, and maintain application programs.

Workers Behind the Scene

1. DBMS System Designers and Implementers

Design and implement DBMS software components.

2. Tool Developers

Develop tools for database design, modeling, and performance tuning.

3. Operators and Maintenance Personnel

Responsible for running and maintaining the hardware and software environment.